A Novel Replication Strategy in Data Grid Environment with a Dynamic Threshold

Authors

  • Sheida Dayyani
  • Mohammad Reza Khayyambashi
Abstract

A Data Grid is a type of Grid Computing system designed to provide geographically distributed data resources to large computational problems that require mining and evaluating large amounts of data. Managing this data in a centralized location increases data access time and, with it, job execution time. Replication is used to reduce data access time. Data replication is an important optimization technique that aims to improve data access time and to utilize network and storage resources efficiently. Since the data files are very large and Grid storage is limited, managing replicas so that storage is used effectively requires more attention. In this paper, a novel data replication strategy called Dynamic Hierarchical Replication with Threshold (DHRT) is proposed. This strategy is an enhanced version of the Dynamic Hierarchical Replication (DHR) strategy that uses a new threshold to determine the number of appropriate sites for replication. Appropriate sites are those with the higher number of accesses for that particular replica from other sites. The strategy also minimizes access latency by selecting the best replica when several sites hold replicas. The proposed replica selection strategy selects the best replica location for users' running jobs by considering the replica requests waiting in the storage and the number of stored files. Simulation results with OptorSim, the European Data Grid simulator, show that the DHRT strategy performs better than the other algorithms and prevents the unnecessary creation of replicas, which leads to efficient storage usage.
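The two decisions described in the abstract, choosing where to place replicas with a threshold and choosing which existing replica to read, can be illustrated with a minimal sketch. This is not the DHRT algorithm from the paper: the abstract gives no formulas, so the mean-based threshold, the weighted cost, the weights `w_queue` and `w_files`, and all field and function names below are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Site:
    name: str
    access_count: int      # accesses recorded for the file at this site (hypothetical field)
    pending_requests: int  # replica requests waiting in the site's storage queue
    stored_files: int      # number of files currently held in the site's storage


def sites_to_replicate(sites, threshold=None):
    """Return the names of sites whose access count meets the threshold.

    Assumption: when no threshold is given, the mean access count is used.
    The paper's actual dynamic threshold is not stated in the abstract.
    """
    if not sites:
        return []
    if threshold is None:
        threshold = mean(s.access_count for s in sites)
    return [s.name for s in sites if s.access_count >= threshold]


def best_replica(holders, w_queue=1.0, w_files=0.1):
    """Pick the replica holder with the lowest assumed cost.

    Assumption: cost is a weighted sum of waiting requests and stored files;
    the weights are illustrative, not taken from the paper.
    """
    def cost(site):
        return w_queue * site.pending_requests + w_files * site.stored_files

    return min(holders, key=cost).name


sites = [
    Site("A", access_count=12, pending_requests=4, stored_files=30),
    Site("B", access_count=3, pending_requests=1, stored_files=80),
    Site("C", access_count=9, pending_requests=2, stored_files=20),
]
targets = sites_to_replicate(sites)  # sites at or above the mean access count
chosen = best_replica(sites)         # holder with the lowest assumed cost
```

With these sample values, sites A and C meet the mean-based threshold, and C is selected for reading because it combines a short request queue with lightly loaded storage. Limiting replication to sites above a threshold is what keeps the strategy from creating replicas that would rarely be used.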

Similar Resources

An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity

The data grid technology, which uses the scale of the Internet to solve storage limitation for the huge amount of data, has become one of the hot research topics. Recently, data replication strategies have been widely employed in distributed environment to copy frequently accessed data in suitable sites. The primary purposes are shortening distance of file transmission and achieving files from ...


Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy

Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources and enables users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of files and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...


A Survey of Dynamic Replication Strategies for Improving Response Time in Data Grid Environment

Large-scale data management is a critical problem in distributed systems such as the cloud, P2P systems, the World Wide Web (WWW), and Data Grids. One effective solution is the data replication technique, which efficiently reduces the cost of communication and improves data reliability and response time. Various replication methods can be proposed depending on when, where, and how replicas are gener...


Dynamic Replication based on Firefly Algorithm in Data Grid

In data grid, using reservation is accepted to provide scheduling and service quality. Users need to have an access to the stored data in geographical environment, which can be solved by using replication, and an action taken to reach certainty. As a result, users are directed toward the nearest version to access information. The most important point is to know in which sites and distributed sy...


Reliability and Availability Improvement in Economic Data Grid Environment Based On Clustering Approach

Abstract - One of the important problems in grid environments is data replication in grid sites. Reliability and availability of data replication in some cases is considered low. To separate sites with high reliability and high availability of sites with low availability and low reliability, clustering can be used. In this study, the data grid dynamically evaluate and predict the condition of t...


Publication date: 2014